INTRODUCTION


This work is preliminary, since funding for it was received only six months ago.

We have so far completed a revision of the Borealis code base that makes it much
more usable for general applications. We have worked closely with personnel from
USARIEM to identify special processing needs for PANs in a WPSM setting and have
redesigned major portions of the Borealis code base to reflect them. The redesigned
system takes a novel position with respect to dealing with failure in a sensor network.

BODY 

Distributed  Stream  Processing  and  Sensors 

The Aurora stream processing engine was a very valuable exercise for gaining an understanding
of the basic technical questions that must be addressed by any stream processing
engine. These issues included memory management, tuple scheduling, and load control.
Aurora was designed to run on a single server, and its primary optimization goal was
low-latency processing. This set of assumptions was chosen because it gave us the best
opportunity to study the fundamentals, and it was useful for a large class of monitoring
applications.

We have been designing a new architecture for the Borealis stream processor that more
closely matches the requirements of sensor processing. In particular, Borealis addresses
distributed operation as well as optimization for power consumption and bandwidth.

In  the  case  of  distributed  operation,  we  are  addressing  the  issues  of  automatic  load 
balancing  and  fault-tolerance.  In  this  setting,  load  is  imposed  by  the  network  of  operators 
that  is  processing  a  set  of  input  streams.  Thus,  load  balancing  consists  of  algorithms  for 
moving  operators  from  one  computing  node  to  another,  while  the  system  is  running. 
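
The idea of balancing load by moving operators between nodes can be illustrated with a minimal sketch. This is not the Borealis algorithm; it is a simple greedy heuristic, and the `Node` class, the per-operator load estimates, and the `rebalance_once` function are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A computing node hosting operators (hypothetical model, not Borealis code)."""
    name: str
    # Maps operator name -> estimated load that the operator imposes on the node.
    operators: dict = field(default_factory=dict)

    @property
    def load(self):
        return sum(self.operators.values())


def rebalance_once(nodes):
    """Move one operator from the most loaded node to the least loaded node.

    Returns (operator, source, target), or None if no single move
    would reduce the load gap between the two nodes.
    """
    src = max(nodes, key=lambda n: n.load)
    dst = min(nodes, key=lambda n: n.load)
    gap = src.load - dst.load
    # Moving an operator with load c changes the gap to |gap - 2c|,
    # so a move helps only when 0 < c < gap.
    candidates = [(op, c) for op, c in src.operators.items() if 0 < c < gap]
    if not candidates:
        return None
    # Pick the operator that leaves the smallest remaining gap.
    op, _ = min(candidates, key=lambda oc: abs(gap - 2 * oc[1]))
    dst.operators[op] = src.operators.pop(op)
    return op, src.name, dst.name
```

Calling `rebalance_once` repeatedly until it returns `None` gives a simple balancing loop. A real system would also have to account for the cost of migrating operator state and for the bandwidth of the streams that connect operators on different nodes, which this sketch ignores.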

Fault  tolerance  takes  several  forms.  In  its  more  classic  form,  it  supports  redundant 
computing  elements  that  are  synchronized  in  such  a  way  that  a  primary  node  can  “fail 
over”  to  a  secondary  node  when  a  failure  is  detected.  To  date  no  one  has  adapted  high- 
availability  algorithms  to  operate  efficiently  in  a  streaming  context.  We  have  done  that 
and  have  published  a  paper  [HBR05]  on  the  topic. 
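
As a rough illustration of primary-to-secondary failover in a streaming setting, the sketch below uses a generic checkpoint-and-replay scheme. It is an assumption made for illustration and is not claimed to be the algorithm of [HBR05]; the `Replica` class, the running-sum operator, and the `run_with_failover` function are hypothetical.

```python
class Replica:
    """A running-sum operator whose state can be checkpointed (hypothetical)."""

    def __init__(self):
        self.total = 0
        self.last_seq = -1

    def process(self, seq, value):
        # Ignore tuples that are already reflected in the state.
        if seq <= self.last_seq:
            return None
        self.total += value
        self.last_seq = seq
        return self.total

    def checkpoint(self):
        return (self.total, self.last_seq)

    def restore(self, snapshot):
        self.total, self.last_seq = snapshot


def run_with_failover(tuples, fail_at, checkpoint_every=2):
    """Process (seq, value) tuples; the primary fails just before seq == fail_at."""
    primary, secondary = Replica(), Replica()
    log = []        # tuples buffered upstream since the last checkpoint
    outputs = []
    active = primary
    for seq, value in tuples:
        if active is primary and seq == fail_at:
            # Fail over: the secondary replays the buffered tuples to catch up.
            for s, v in log:
                secondary.process(s, v)
            active = secondary
        outputs.append(active.process(seq, value))
        log.append((seq, value))
        if active is primary and (seq + 1) % checkpoint_every == 0:
            # Synchronize the secondary, then discard the covered tuples.
            secondary.restore(primary.checkpoint())
            log.clear()
    return outputs
```

In this sketch the output stream is the same whether or not the primary fails, because the secondary resumes from the last checkpoint and replays only the tuples logged since then. The trade-off is between checkpoint frequency (run-time overhead) and the length of the replay log (recovery time).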


Classic high-availability is too strong a guarantee for many sensor-based environments.
Instead, an approach that adapts to failures by trading accuracy or confidence for
continued operation is more appropriate. For example, in a sensor-based environment, it
is more reasonable to react to a failure by reducing the accuracy of the results.
Of course, a good adaptive algorithm will minimize the loss in confidence by using its
resources intelligently. We have written and published a paper [TBH05] on this topic
jointly with colleagues from USARIEM.

We have been working with scientists at the Army Soldier Systems Labs in Natick, MA
on a problem of sensor network data management. The goal is to find a way to fit the
WPSM problem into the Borealis framework. In order to do this, some extensions are
required to the Borealis architecture.

In particular, we have come up with a way of capturing multiple processing models for a
given physiological state. The system will select which processing model to use, based
on the availability of inputs. It will also adjust sampling rates (for sensors that can do so)
dynamically in order to place the confidence in an acceptable range.
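
A minimal sketch of this selection and adjustment logic follows. The model names, sensor names, confidence values, and the doubling/halving policy are all assumptions chosen for illustration; they do not come from the Borealis design or from USARIEM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """One processing model for a physiological state (hypothetical)."""
    name: str
    inputs: frozenset      # sensor inputs the model requires
    confidence: float      # assumed confidence of its estimate, in [0, 1]


def select_model(models, available):
    """Return the highest-confidence model whose inputs are all available."""
    usable = [m for m in models if m.inputs <= available]
    if not usable:
        return None
    return max(usable, key=lambda m: m.confidence)


def adjust_rate(rate_hz, confidence, low, high, min_hz, max_hz):
    """Nudge a sensor's sampling rate to keep confidence within [low, high]."""
    if confidence < low:
        rate_hz *= 2       # sample more often to raise confidence
    elif confidence > high:
        rate_hz /= 2       # sample less often to save power and bandwidth
    # Respect the limits of what the sensor can actually do.
    return max(min_hz, min(max_hz, rate_hz))


# Hypothetical models for a single physiological state.
MODELS = [
    Model("full", frozenset({"heart_rate", "skin_temp", "core_temp"}), 0.9),
    Model("reduced", frozenset({"heart_rate", "skin_temp"}), 0.7),
    Model("minimal", frozenset({"heart_rate"}), 0.4),
]
```

For example, if the core temperature sensor drops out, `select_model(MODELS, {"heart_rate", "skin_temp"})` falls back to the "reduced" model instead of failing outright, and `adjust_rate` can then raise the sampling rate of the remaining sensors if the resulting confidence is below the acceptable range.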

Borealis  Development  Progress 

We  have  put  some  effort  into  improving  the  infrastructure  of  the  Borealis  code  base.  The 
installation  and  build  processes  have  been  enhanced  and  documented  to  make  them  easier 
to learn and use. The source code directory tree has been restructured to make the code
base extensible and to allow modules to be built optionally. The code has also been upgraded
to work with the latest versions of external code packages and tools.

Scripts have been written to access the revision control system. They are easier to use
than  raw  commands  and  are  more  reliable  as  they  detect  collisions  that  occur  when 
multiple  developers  are  working  on  the  same  module. 

Borealis has been changed to remove many hardware dependencies. Portable schema
types were introduced for use by Borealis applications. To make the system source code
itself more portable, machine-independent data types have been declared and code has
been modified to use them.


A  design  and  development  effort  is  underway  to  provide  a  new  programming  interface 
for  applications.  The  new  XML-based  interface  will  accommodate  new  features  and  will 
be  much  easier  to  use  than  the  current  interface.  The  goal  is  to  provide  the  tools  and 
documentation to enable end users to write Borealis applications with a minimum of
effort. A distributed catalog is also in the design phase. It will replace the current central
catalog so that the system can scale up to multiple processors.

A website for developers has been created to disseminate information on Borealis
development issues and to exchange details about ongoing projects. The website is
located at:

http://www.cs.brown.edu/research/borealis/developer/


KEY  RESEARCH  ACCOMPLISHMENTS 

•  A  redesign  of  the  Aurora  stream  processing  engine  (now  called  Borealis)  to 
manage  complex  resources  in  a  distributed  environment. 

•  A  new  theory  of  how  to  do  confidence-based  resource  management  in  a  failure- 
prone  environment. 


•  A  simulator  to  assist  USARIEM  in  understanding  various  parameterizations  of 
their  wireless  PAN. 

REPORTABLE  OUTCOMES 

•  A  simple,  demo-able  version  of  Borealis 

•  A  well-received  demo  of  Borealis  at  this  year’s  SIGMOD,  the  most  prestigious 
database  conference. 

•  Helpful  feedback  to  USARIEM. 

CONCLUSIONS 

So  far,  we  have  been  very  successful  at  producing  a  novel  infrastructure  for  distributed 
stream processing. We are in the process of fitting it into a sensor-based environment at
USARIEM. The system will be able to manage power consumption and bandwidth
utilization automatically. It will trade off accuracy (confidence) against the use of resources.